Conclusion
Fig. 1a & 1b - Positive Word Cloud on Elon Musk, Negative Word Cloud on Elon Musk
In conclusion, NLP processing allowed us to build word clouds that separate positive from negative sentiment toward key stakeholders on social media, such as Elon Musk. The sentiment analyses were designed to examine not only the controversies and popular opinions surrounding Elon Musk, but also reputation, comment quality, and the timelines associated with views on Elon Musk, Tesla, Twitter, and other important topics over the 2021 to 2022 period.
This work has successfully addressed multiple business goals listed at the beginning of the project and has provided insight into how to proceed with the current findings. For example, further analysis could cover other well-known tech companies, or companies in other fields, to compare and contrast the popularity of their key stakeholders. In addition, the Reddit data could be combined with stock market data to predict stock values from the sentiment distributions of these key stakeholders.
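As an illustration of the idea behind Fig. 1a and 1b, the sketch below counts word frequencies separately for each sentiment label, which is the kind of input a word cloud is drawn from. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the function name `word_frequencies_by_sentiment`, the small stop-word list, and the sample comments are all hypothetical, and the sentiment labels are assumed to come from an upstream sentiment model.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for this
# sketch); a real pipeline would use a fuller list.
STOP_WORDS = {"the", "and", "for", "his", "this", "that", "are", "with"}


def word_frequencies_by_sentiment(comments):
    """Count word frequencies separately for each sentiment label.

    `comments` is an iterable of (text, label) pairs, where `label` is a
    string such as "positive" or "negative" assigned upstream.
    """
    counts = {}
    for text, label in comments:
        # Lowercase and keep alphabetic tokens (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", text.lower())
        # Drop stop words and very short tokens before counting.
        counts.setdefault(label, Counter()).update(
            t for t in tokens if t not in STOP_WORDS and len(t) > 2
        )
    return counts


# Made-up comments for illustration only; these are not project data.
sample = [
    ("Tesla innovation is amazing", "positive"),
    ("Amazing rockets and amazing cars", "positive"),
    ("The Twitter takeover is chaos", "negative"),
    ("Chaos and layoffs on Twitter", "negative"),
]

freqs = word_frequencies_by_sentiment(sample)
print(freqs["positive"].most_common(3))
print(freqs["negative"].most_common(3))
```

Each resulting `Counter` could then be passed to a word-cloud renderer, for example the `generate_from_frequencies` method of the `wordcloud` package's `WordCloud` class, to draw one cloud per sentiment.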
Fig. 3 - Feature Importance of Classification on Controversiality
With the models for predicting controversiality, submission score, and other variables, we are able to accurately predict these variables and build indexes for them. We have also gained a better understanding of the variables that affect them most, such as flair, subreddit type, body length, and score. Because these variables have higher feature importance, they are good candidates for predicting other values of interest in the Reddit data, for example controversiality within one subreddit versus another, or submission score for comments with certain sentiments. Many such combinations are possible, and visualizing them would support more in-depth comparisons.